Dictionary learning with large step gradient descent for sparse representations
This is the accepted version of an article published in Lecture Notes in Computer Science Volume 7191, 2012, pp 231-238. The final publication is available at link.springer.com
http://www.springerlink.com/content/l1k4514765283618
A high-quality video denoising algorithm based on reliable motion estimation
11th European Conference on Computer Vision, Heraklion, Crete, Greece, September 5-11, 2010, Proceedings, Part III. Although recent advances in sparse representations of images have achieved outstanding denoising results, removing real, structured noise in digital videos remains a challenging problem. We show the utility of reliable motion estimation to establish temporal correspondence across frames in order to achieve high-quality video denoising. In this paper, we propose an adaptive video denoising framework that integrates robust optical flow into a non-local means (NLM) framework with noise level estimation. The spatial regularization in optical flow is the key to ensuring temporal coherence in removing structured noise. Furthermore, we introduce approximate K-nearest neighbor matching to significantly reduce the complexity of classical NLM methods. Experimental results show that our system is comparable with the state of the art in removing AWGN, and significantly outperforms the state of the art in removing real, structured noise.
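The non-local means idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's method: it works on a 1D signal rather than video, uses no optical flow or noise level estimation, and finds the K nearest patches by exhaustive search instead of approximate matching. The function names and the parameters `radius`, `k`, and `h` are hypothetical choices made for illustration.

```python
import math

def patch(signal, center, radius):
    """Return the patch around `center`, clamping indices at the borders."""
    n = len(signal)
    return [signal[min(max(center + o, 0), n - 1)] for o in range(-radius, radius + 1)]

def nlm_denoise(signal, radius=1, k=4, h=0.5):
    """Classic non-local means on a 1D signal, limited to the k most similar patches.

    Assumption: exhaustive nearest-patch search, not the approximate
    K-nearest neighbor matching described in the abstract.
    """
    n = len(signal)
    patches = [patch(signal, i, radius) for i in range(n)]
    out = []
    for i in range(n):
        # Squared patch distance from position i to every position j.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(patches[i], patches[j])), j)
            for j in range(n)
        )
        nearest = dists[:k]  # always contains a zero-distance match (i itself)
        weights = [math.exp(-d / (h * h)) for d, _ in nearest]
        total = sum(weights)
        # Weighted average of the centre samples of the selected patches.
        out.append(sum(w * signal[j] for w, (_, j) in zip(weights, nearest)) / total)
    return out

noisy = [0.1, -0.1, 0.05, 1.1, 0.9, 1.0, 0.0, 0.1, 1.05, 0.95]
denoised = nlm_denoise(noisy)
constant = nlm_denoise([2.0] * 6)
```

Each output sample is a convex combination of input samples, so a constant signal passes through unchanged and no output leaves the range of the input.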
Transformation Equivariant Boltzmann Machines
Abstract. We develop a novel modeling framework for Boltzmann machines, augmenting each hidden unit with a latent transformation assignment variable which describes the selection of the transformed view of the canonical connection weights associated with the unit. This enables the inferences of the model to transform in response to transformed input data in a stable and predictable way, and avoids learning multiple features differing only with respect to the set of transformations. Extending prior work on translation equivariant (convolutional) models, we develop translation and rotation equivariant restricted Boltzmann machines (RBMs) and deep belief nets (DBNs), and demonstrate their effectiveness in learning frequently occurring statistical structure from artificial and natural images
Combining visual and textual systems within the context of user feedback
It has been shown experimentally that a combination of textual and visual representations can improve retrieval performance ([20], [23]). This is because the textual and visual feature spaces often represent complementary yet correlated aspects of the same image, thus forming a composite system.
In this paper, we present a model for the combination of visual and textual sub-systems within the user feedback context. The model was inspired by the measurement utilized in quantum mechanics (QM) and the tensor product of co-occurrence (density) matrices, which represents a density matrix of the composite system in QM. It provides a sound and natural framework to seamlessly integrate multiple feature spaces by considering them as a composite system, as well as a new way of measuring the relevance of an image with respect to a context. The proposed approach takes into account both intra- (via co-occurrence matrices) and inter- (via the tensor operator) relationships between feature dimensions. It is also computationally cheap and scalable to large data collections. We test our approach on the ImageCLEF2007photo data collection and present interesting findings.
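One way to read the tensor-product construction in this abstract is sketched below. This is an assumption-laden illustration, not the paper's exact model: the density matrices are taken to be unit-trace sums of outer products of feedback feature vectors, and relevance is taken to be the quadratic form of a candidate image's tensor-product vector. All function names and the toy vectors are hypothetical.

```python
def density_matrix(vectors):
    """Sum of outer products v v^T over feedback vectors, scaled to unit trace."""
    dim = len(vectors[0])
    m = [[0.0] * dim for _ in range(dim)]
    for v in vectors:
        for i in range(dim):
            for j in range(dim):
                m[i][j] += v[i] * v[j]
    trace = sum(m[i][i] for i in range(dim))
    return [[x / trace for x in row] for row in m]

def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
        for i in range(len(a)) for k in range(len(b))
    ]

def kron_vec(u, v):
    """Tensor product of two vectors, ordered consistently with kron()."""
    return [a * b for a in u for b in v]

def expectation(rho, x):
    """Quadratic form x^T rho x, used here as a measurement-style relevance score."""
    n = len(x)
    return sum(x[i] * rho[i][j] * x[j] for i in range(n) for j in range(n))

# Toy feedback set: visual and textual feature vectors of images marked relevant.
visual_feedback = [[1.0, 0.0], [0.6, 0.8]]
textual_feedback = [[0.0, 1.0, 0.0], [0.0, 0.6, 0.8]]
rho_visual = density_matrix(visual_feedback)
rho_textual = density_matrix(textual_feedback)
rho_composite = kron(rho_visual, rho_textual)

# Candidate image described by one visual and one textual vector.
cand_visual, cand_textual = [1.0, 0.0], [0.0, 1.0, 0.0]
score_composite = expectation(rho_composite, kron_vec(cand_visual, cand_textual))
score_factored = expectation(rho_visual, cand_visual) * expectation(rho_textual, cand_textual)
```

Under these assumptions the composite score equals the product of the two per-modality scores, so in this sketch the large Kronecker matrix never has to be built explicitly.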
Bayesian Painting by Numbers: Flexible Priors for Colour-Invariant Object Recognition
Generative models of images should take into account transformations of geometry and reflectance. Then, they can provide explanations of images that are factorized into intrinsic properties that are useful for subsequent tasks, such as object classification. It was previously shown how images and objects within images could be described as compositions of regions called structural elements or ‘stels’. In this way, transformations of the reflectance and illumination of object parts could be accounted for using a hidden variable that is used to ‘paint’ the same stel differently in different images. For example, the stel corresponding to the petals of a flower can be red in one image and yellow in another. Previous stel models have used a fixed number of stels per image and per image class. Here, we introduce a Bayesian stel model, the colour-invariant admixture (CIA) model, which can infer different numbers of stels for different object types, as appropriate. Results on Caltech101 images show that this method is capable of automatically selecting a number of stels that reflects the complexity of the object class and that these stels are useful for object recognition.
Engineering and Applied Science
Deep Learning of Representations: Looking Forward
Deep learning research aims at discovering learning algorithms that discover
multiple levels of distributed representations, with higher levels representing
more abstract concepts. Although the study of deep learning has already led to
impressive theoretical results, learning algorithms and breakthrough
experiments, several challenges lie ahead. This paper proposes to examine some
of these challenges, centering on the questions of scaling deep learning
algorithms to much larger models and datasets, reducing optimization
difficulties due to ill-conditioning or local minima, designing more efficient
and powerful inference and sampling procedures, and learning to disentangle the
factors of variation underlying the observed data. It also proposes a few
forward-looking research directions aimed at overcoming these challenges
On Computationally-Enhanced Visual Analysis of Heterogeneous Data and Its Application in Biomedical Informatics